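Neural networks have been highly successful at extracting geometric information from color images. In particular, monocular depth estimation networks are increasingly reliable in real-world scenes. In this work, we investigate how well such monocular depth estimation networks apply to semi-transparent volume-rendered images. Because depth is notoriously difficult to define in a volumetric scene that lacks clearly defined surfaces, we consider several depth computations that have emerged in practice, and we compare state-of-the-art monocular depth estimation methods under these different interpretations in an evaluation that covers varying degrees of opacity in the renderings. In addition, we investigate how these networks can be extended to also predict color and opacity, so that a layered representation of the scene can be created from a single color image. This layered representation consists of spatially separated, semi-transparent intervals that composite to the original input rendering. Our experiments show that adaptations of existing monocular depth estimation methods perform well on semi-transparent volume renderings, which has several applications in scientific visualization.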
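To make the idea of semi-transparent layers that composite back to the input rendering concrete, the following is a minimal sketch of standard front-to-back "over" compositing. It is not the authors' implementation: the function name `composite_layers`, the array layout, and the use of non-premultiplied colors are assumptions chosen for illustration.

```python
import numpy as np


def composite_layers(colors, alphas):
    """Front-to-back 'over' compositing of semi-transparent layers.

    colors: array of shape (L, H, W, 3), non-premultiplied RGB, front layer first
            (layout is an assumption made for this sketch).
    alphas: array of shape (L, H, W), opacity of each layer in [0, 1].
    Returns the composited RGB image (H, W, 3) and the accumulated opacity (H, W).
    """
    colors = np.asarray(colors, dtype=np.float64)
    alphas = np.asarray(alphas, dtype=np.float64)
    out_rgb = np.zeros(colors.shape[1:])
    out_alpha = np.zeros(alphas.shape[1:])
    for rgb, alpha in zip(colors, alphas):
        # Fraction of this layer that is still visible behind the layers in front.
        weight = (1.0 - out_alpha) * alpha
        out_rgb += weight[..., None] * rgb
        out_alpha += weight
    return out_rgb, out_alpha


# Usage: a half-transparent red layer in front of an opaque blue layer (one pixel).
colors = np.array([[[[1.0, 0.0, 0.0]]], [[[0.0, 0.0, 1.0]]]])
alphas = np.array([[[0.5]], [[1.0]]])
rgb, alpha = composite_layers(colors, alphas)
```

If a network predicts one color and opacity map per depth interval, compositing them this way and comparing the result with the input image is one plausible way to check that the layers are consistent with the original rendering.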
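A key to deciphering the inner workings of neural networks is understanding what a model has learned. Promising methods for discovering learned features are based on analyzing activation values, and current techniques focus on high activation values to reveal interesting features at the neuron level. However, analyzing only high activation values limits concept discovery at the layer level. We present a method that instead takes the entire activation distribution into account. By extracting similar activation profiles within the high-dimensional activation space of a neural network layer, we find groups of inputs that are treated similarly. These input groups represent neural activation patterns (NAPs) and can be used to visualize and interpret learned layer concepts. We release a framework for extracting NAPs from pre-trained models and provide a visual introspection tool for analyzing them. We tested our method with a variety of networks and show how it complements existing methods for analyzing neural network activation values.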
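Appropriate weight initialization is of key importance for successfully training neural networks. Recently, batch normalization has diminished the role of weight initialization by simply normalizing each layer based on batch statistics. Unfortunately, batch normalization has several drawbacks when applied to small batch sizes, which are required to cope with memory limitations when learning on point clouds. Although well-founded weight initialization strategies can make batch normalization unnecessary and thus avoid these drawbacks, no such approach has been proposed for point convolutional networks. To fill this gap, we propose a framework that unifies the multitude of continuous convolutions. This enables our main contribution, variance-aware weight initialization. We show that this initialization can avoid batch normalization while achieving similar and, in some cases, better performance.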
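Time-of-flight (ToF) cameras are subject to high levels of noise and distortion caused by multi-path interference (MPI). While recent research has shown that 2D neural networks can outperform previous traditional state-of-the-art (SOTA) methods at denoising ToF data, little research on learning-based approaches has made direct use of the 3D information present in depth images. In this paper, we propose an iterative denoising approach that operates in 3D space and is designed to learn on 2.5D data by enabling 3D point convolutions to correct the positions of points along the view direction. Because labeled real-world data is scarce for this task, we additionally train our network on unlabeled real-world data to account for real-world statistics. We demonstrate that our method outperforms SOTA methods on several datasets, including two real-world datasets and a new large-scale synthetic dataset introduced in this paper.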
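Conventional methods for human pose estimation either require a high degree of instrumentation, by relying on many inertial measurement units (IMUs), or constrain the recording space, by relying on external cameras. Human pose estimation from sparse IMU data addresses these shortcomings. We define adjacency adaptive graph convolutional long short-term memory networks (AAGC-LSTM) for human pose estimation based on six IMUs, incorporating the graph structure of the human body directly into the network. The AAGC-LSTM combines spatial and temporal dependencies in a single network operation and is more memory-efficient than previous approaches. This is made possible by equipping graph convolutions with adjacency adaptivity, which eliminates the problem of information loss in deep or recurrent graph networks while also allowing unknown dependencies between human body joints to be learned. To further improve accuracy, we propose longitudinal loss weighting, which accounts for natural movement patterns. With the presented approach, we can exploit the inherent graph nature of the human body and thus outperform the state of the art (SOTA) in human pose estimation from sparse IMU data.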
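Due to the success of deep learning (DL) and its growing job market, students and researchers from many fields are interested in learning about DL technologies. Visualization has proven to be of great help in this learning process. While most current educational visualizations target one specific architecture or use case, recurrent neural networks (RNNs), which can process sequential data, are not yet covered, even though tasks on sequential data, such as text and function analysis, are at the forefront of DL research. We therefore propose exploRNN, the first interactively explorable educational visualization for RNNs. With the aim of making learning easier and more fun, we define educational objectives targeted at understanding RNNs, and we use these objectives to derive guidelines for the visual design process. exploRNN, which is accessible online, provides an overview of the RNN training process at a coarse level while also allowing detailed inspection of the data flow within LSTM cells. In an empirical study, we assessed 37 subjects in a between-subjects design to investigate the learning outcomes and cognitive load of exploRNN compared to a classic text-based learning environment. While learners in the text group were ahead in superficial knowledge acquisition, exploRNN was particularly helpful for a deeper understanding of the learning content. In addition, the complex content in exploRNN was perceived as significantly easier and caused less extraneous load than in the text group. The study shows that for difficult learning material such as recurrent networks, where deep understanding is important, interactive visualizations such as exploRNN can be helpful.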
Cybercriminals are moving towards zero-day attacks affecting resource-constrained devices such as single-board computers (SBCs). Assuming that perfect security is unrealistic, Moving Target Defense (MTD) is a promising approach to mitigate attacks by dynamically altering target attack surfaces. Still, selecting suitable MTD techniques for zero-day attacks is an open challenge. Reinforcement Learning (RL) could be an effective approach to optimize MTD selection through trial and error, but the literature falls short in i) evaluating the performance of RL and MTD solutions in real-world scenarios, ii) studying whether behavioral fingerprinting is suitable for representing SBCs' states, and iii) calculating the resource consumption on SBCs. To address these limitations, the work at hand proposes an online RL-based framework that learns the correct MTD mechanisms for mitigating heterogeneous zero-day attacks on SBCs. The framework uses behavioral fingerprinting to represent SBCs' states and RL to learn the MTD techniques that mitigate each malicious state. It has been deployed in a real IoT crowdsensing scenario with a Raspberry Pi acting as a spectrum sensor. In more detail, the Raspberry Pi was infected with different samples of command and control malware, rootkits, and ransomware, and the framework then selected among four existing MTD techniques. A set of experiments demonstrated the suitability of the framework for learning proper MTD techniques that mitigate all attacks (except a harmful rootkit) while consuming <1 MB of storage and utilizing <55% CPU and <80% RAM.
We introduce a machine-learning (ML)-based weather simulator--called "GraphCast"--which outperforms the most accurate deterministic operational medium-range weather forecasting system in the world, as well as all previous ML baselines. GraphCast is an autoregressive model, based on graph neural networks and a novel high-resolution multi-scale mesh representation, which we trained on historical weather data from the European Centre for Medium-Range Weather Forecasts (ECMWF)'s ERA5 reanalysis archive. It can make 10-day forecasts, at 6-hour time intervals, of five surface variables and six atmospheric variables, each at 37 vertical pressure levels, on a 0.25-degree latitude-longitude grid, which corresponds to roughly 25 x 25 kilometer resolution at the equator. Our results show GraphCast is more accurate than ECMWF's deterministic operational forecasting system, HRES, on 90.0% of the 2760 variable and lead time combinations we evaluated. GraphCast also outperforms the most accurate previous ML-based weather forecasting model on 99.2% of the 252 targets it reported. GraphCast can generate a 10-day forecast (35 gigabytes of data) in under 60 seconds on Cloud TPU v4 hardware. Unlike traditional forecasting methods, ML-based forecasting scales well with data: by training on bigger, higher quality, and more recent data, the skill of the forecasts can improve. Together these results represent a key step forward in complementing and improving weather modeling with ML, open new opportunities for fast, accurate forecasting, and help realize the promise of ML-based simulation in the physical sciences.
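The figures quoted in the abstract can be checked with a little arithmetic, and the autoregressive scheme can be sketched generically: each predicted state is fed back in as the input for the next step. The sketch below is illustrative only. The `rollout` helper and the toy `step` function are hypothetical, and the real model's inputs and interface are not described in the abstract.

```python
from typing import Callable, List, TypeVar

State = TypeVar("State")


def rollout(step: Callable[[State], State], initial: State,
            horizon_hours: int = 240, step_hours: int = 6) -> List[State]:
    """Generic autoregressive rollout: feed each prediction back in as input.

    `step` is a hypothetical one-step predictor (current state -> next state).
    """
    if horizon_hours % step_hours != 0:
        raise ValueError("horizon must be a multiple of the step size")
    states: List[State] = []
    current = initial
    for _ in range(horizon_hours // step_hours):
        current = step(current)
        states.append(current)
    return states


# Quantities derived from the abstract's numbers.
n_steps = (10 * 24) // 6          # 10-day forecast at 6-hour intervals
n_channels = 5 + 6 * 37           # surface variables + atmospheric variables x levels

# Toy usage: the "state" is just the forecast lead time in hours.
lead_times = rollout(lambda hours: hours + 6, 0)
```

Under these assumptions a 10-day forecast corresponds to 40 autoregressive steps, and each grid point carries 227 values per step.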
Diffusion models have shown a great ability to bridge the performance gap between predictive and generative approaches for speech enhancement. We have shown that they may even outperform their predictive counterparts for non-additive corruption types or when they are evaluated on mismatched conditions. However, diffusion models suffer from a high computational burden, mainly because they require running a neural network for each reverse diffusion step, whereas predictive approaches require only one pass. Because diffusion models are generative approaches, they may also produce vocalizing and breathing artifacts in adverse conditions. In comparison, in such difficult scenarios, predictive models typically do not produce such artifacts but tend to distort the target speech instead, thereby degrading the speech quality. In this work, we present a stochastic regeneration approach in which an estimate given by a predictive model is provided as a guide for further diffusion. We show that the proposed approach uses the predictive model to remove the vocalizing and breathing artifacts while producing very high quality samples thanks to the diffusion model, even in adverse conditions. We further show that this approach enables the use of lighter sampling schemes with fewer diffusion steps without sacrificing quality, thus reducing the computational burden by an order of magnitude. Source code and audio examples are available online (https://uhh.de/inf-sp-storm).
Recently, many causal estimators for Conditional Average Treatment Effect (CATE) and instrumental variable (IV) problems have been published and open sourced, making it possible to estimate the granular impact of both randomized treatments (such as A/B tests) and user choices on the outcomes of interest. However, the practical application of such models has been hampered by the lack of a valid way to score their performance out of sample in order to select the best one for a given application. We address that gap by proposing novel scoring approaches for both the CATE case and an important subset of instrumental variable problems, namely those where the instrumental variable is customer access to a product feature and the treatment is the customer's choice to use that feature. Being able to score model performance out of sample allows us to apply hyperparameter optimization methods to causal model selection and tuning. We implement this in an open source package that relies on the DoWhy and EconML libraries for the implementation of causal inference models (and also includes a Transformed Outcome model implementation), and on FLAML for hyperparameter optimization and for the component models used in the causal models. We demonstrate on synthetic data that optimizing the proposed scores is a reliable method for choosing the model and its hyperparameter values, whose estimates are close to the true impact, in the randomized CATE and IV cases. Further, we provide examples of applying these methods to real customer data from Wise.
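As background on the transformed outcome mentioned above, the sketch below shows the standard construction for a binary treatment with known assignment probability p: the transformed variable Y* = Y (W - p) / (p (1 - p)) has conditional expectation equal to the CATE. This is a generic illustration on simulated data, not the package's implementation. The function names, the data-generating process, and the constant effect of 2.0 are assumptions made for the example.

```python
import numpy as np


def transformed_outcome(y, w, p):
    """Transformed outcome for binary treatment w with known propensity p.

    Its expectation given the covariates equals the treatment effect.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    return y * (w - p) / (p * (1.0 - p))


def simulate(n=200_000, tau=2.0, p=0.5, seed=0):
    """Hypothetical randomized experiment with a constant true effect tau."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                 # a covariate that shifts the outcome
    w = rng.binomial(1, p, size=n)         # randomized treatment assignment
    y = x + tau * w + rng.normal(size=n)   # observed outcome
    return y, w


y, w = simulate()
# Averaging the transformed outcome estimates the average treatment effect.
estimate = transformed_outcome(y, w, p=0.5).mean()
```

Because the transformed outcome is an unbiased but noisy proxy for the individual effect, it can serve as a regression target for an effect model, which is one common reason it appears alongside CATE estimators.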